Feature Selection in a French MEMD Language Model
Similar resources
A Maximum Entropy/Minimum Divergence Translation Model
I present empirical comparisons between a linear combination of standard statistical language and translation models and an equivalent Maximum Entropy/Minimum Divergence (MEMD) model, using several different methods for automatic feature selection. The MEMD model significantly outperforms the standard model in test corpus perplexity, even though it has far fewer parameters.
A Comparison of Criteria for Maximum Entropy/Minimum Divergence Feature Selection
In this paper we study the gain, a naturally arising statistic from the theory of MEMD modeling, as a figure of merit for selecting features for an MEMD language model. We compare the gain with two popular alternatives, empirical activation and mutual information, and argue that the gain is the preferred statistic on the grounds that it directly measures a feature's contribution to improving upon the...
A Maximum Entropy/Minimum Divergence Translation Model
I present empirical comparisons between a standard statistical translation model and an equivalent Maximum Entropy/Minimum Divergence (MEMD) model, using several different methods for automatic feature selection. Results show that the MEMD model significantly outperforms the standard model in test corpus perplexity, even though it has far fewer parameters.
An "AI readability" Formula for French as a Foreign Language
This paper presents a new readability formula for French as a foreign language (FFL), which relies on 46 textual features representative of the lexical, syntactic, and semantic levels as well as some of the specificities of the FFL context. We report comparisons between several techniques for feature selection and various learning algorithms. Our best model, based on support vector machines (SVM...
An Improved Flower Pollination Algorithm with AdaBoost Algorithm for Feature Selection in Text Documents Classification
In recent years, the production of text documents has grown exponentially, which makes their proper classification necessary for better access. One of the main problems in classifying text documents is working in a high-dimensional feature space. Feature Selection (FS) is one of the ways to reduce the number of text attributes. So, working with a great bulk of the feature spa...